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ATTITUDE FILTERING ON SO(3) 

F. Landis Markley* 

A new method is presented for the simultaneous estimation of the attitude of a 
spacecraft and an A- vector of bias parameters. This method uses a probability 
distribution function defined on the Cartesian product of SO(3), the group of 
rotation matrices, and the Euclidean space .The Fokker-Planck equation 
propagates the probability distribution function between measurements, and 
Bayes’s formula incorporates measurement update information. This approach 
avoids all the issues of singular attitude representations or singular covariance 
matrices encountered in extended Kalman filters. In addition, the filter has a 
consistent initialization for a completely unknown initial attitude, owing to the 
fact that SO(3) is a compact space. 

INTRODUCTION 

The extended Kalman filter (EKF) 11-7] is the workhorse of real-time spacecraft attitude estimation. Since 
the group SO(3) of rotation matrices has dimension three, most attitude determination EKFs use lower- 
dimensional attitude parameterizations than the nine-parameter attitude matrix itself. The fact that all three- 
parameter representations of SO(3) are singular or discontinuous for certain attitudes [8] has led to extended 
discussions of constraints and attitude representations in EKFs [6, 7, 9-11]. These issues are now well 
understood, however, and the EKF has performed admirably in the vast majority of attitude determination 
applications. Nevertheless, poor performance or even divergence arising from the linearization implicit in 
the EKF has led to the development of nonlinear filters, most recently sigma point or unscented filters 
[12-15] and particle filters [16-19]. These filters have been shown to perform better than the EKF in some 
cases at the expense of increased computational cost. 

The optimal solution of the nonlinear estimation problem requires the propagation of the probability 
density function (PDF) of the state conditioned on the observation history [2, 19]. All practical nonlinear 
filters are approximations to this ideal. Exact finite dimensional filters [21] can be found that solve some 
nonlinear problems by using the Fokker-Planck (or forward Kolmogorov) equation [2, 20] to propagate a 
non-Gaussian PDF between measurements and Bayes’s formula [1, 2] to incorporate measurement 
information. The filter we propose follows this pattern, but does not solve the nonlinear attitude filtering 
problem exactly. We refer to it as an orthogonal filter, because it represents the attitude by an orthogonal 
rotation matrix, rather than by some parameterization of the rotation matrix. The PDF is a non-Gaussian 
function defined on the Cartesian product of SO(3), the group of rotation matrices, and the Euclidean 
space R n of bias parameters. This filter entirely avoids questions about singularities of representations or 
covariance matrices arising in EKFs [6, 7, 9, 10] and unscented filters [15], and has the additional advantage 
of providing a consistent initialization for a completely unknown initial attitude, owing to the fact that 
SO(3) is a compact space. This approach is closely related to earlier work by Lo but has the advantage of 
including both process noise and parameters other than the attitude itself [22, 23]. 
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The paper begins with a review of some well-known facts about Gaussian PDFs to establish notation and 
to provide a background for the orthogonal filter. Then we introduce our parameterization for the attitude- 
only PDF and discuss some of its properties. With this preparation complete, we turn to the specification 
of the PDF for the combined estimation of attitude and an N- vector of bias parameters. We discuss the 
measurement update of the PDF by Bayes’s formula and the propagation of the PDF between measurements 
by the Fokker-Planck equation, deriving explicit formulas for a specific measurement model and a specific 
dynamic propagation model. We then investigate the Cramer-Rao lower bound on the covariance, which 
gives a useful estimate of filter performance [1]. We test the new algorithm in a simulation of an Earth- 
pointing spacecraft equipped with gyros and a three-axis magnetometer, and conclude by summarizing the 
pros and cons of the orthogonal filter. 

GAUSSIAN PDF 

Let x^ denote an N-dimensional state vector at time ^ and Y v denote the set of measurements up to and 
including time t v . Note that we will use Greek subscripts to denote time or measurement indices, to avoid 
confusion with Latin subscripts labeling matrix and vector components, and that we will be rather cavalier 
in distinguishing between random processes and their realizations. The Gaussian PDF with mean and 

positive definite covariance for x^ conditioned on the measurements Y v is 

I Y v (x) = [( 2j *) N det P^ w r 1/2 exp[- ^ (x - x^ lv ) r P~^ (x - x^ )] = exp[- J Xfi ,y y (x)] . (1) 

The negative-log-likelihood function J x ^ lY ^(x) is given by 

J x„ I Y v (X) = i(x- x„ lv ) r (X-X llW ) + ... = %X T F * l iyX - v x + . . . , (2) 


where F* w = P^l is the positive definite information matrix and = P^ \ l v x^ v . The ellipses denote 

omitted terms that are independent of x. The dependence on the measurement history Y v is not exhibited 
explicitly in the arguments of p x ^\y v (x) and «/ X/i \y v ( x ) * b ut * s implicitly indicated by the subscripts p I v 

on the various parameters in the distribution. This PDF has ^N(N + 3) free parameters, y N(N + 1) in the 
information matrix and N in the vector , which contains information about the mean. 

There are three estimates of interest: the conditional expectation 

Xjuiv s I x P\p ,y v (x) X , ( 3 ) 

r n 

the maximum a posteriori probability (MAP) estimate 


and the minimum mean-square error (MMSE) estimate 


MMSE _ argmin 


Vlv 


\'eR‘ 


J |x-xf p w (x)d N x 


LE" 


H lI v 


arg min 


x'eE 


N 


J |x - X#l|v | p ,y v (X) d N X + |x Ml v - x'\ 


(4) 


(5) 


Le 


It is easy to see that all three estimates are identical and are equal to x^ v for a Gaussian PDF. It is an 

interesting exercise to show that the Fokker-Planck equation and Bayes’s formula lead to the usual Kalman 
filter propagation and measurement update for linear dynamics and measurement models, respectively [2]. 
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We define the PDF of the attitude p A \ Yy (A) so that p A ^ Yy (A)dA is the probability that the attitude 
matrix is in some volume dA in SO(3) around A. The volume element dA is defined by the Haar measure 
on SO(3), which is unique up to a multiplicative constant [24, 25]. In parallel with the Gaussian case, we 
write the PDF in terms of a negative-log-likelihood function as 

Pa m iv v (A) = exp[ -J Aft ,y v (A)] . (6) 


The specification of the negative-log-likelihood function appeals to the Peter-Weyl Theorem, which states 
that a continuous function on a compact Lie group can be uniformly approximated by a linear combination 
of the matrix elements of the unitary irreducible representations of the group [26-28]. For the rotation 
group SO(3), these matrices are the (2£ + 1) x (2£ + 1) rotation matrices D c for integer £ [27, 29], so we 
have 


OO t t 

,(A)=I S I 

(=0 m--i m'=-£ 


c, D , 


(7) 


where the c^ m are scalar coefficients. For numerical work, this series on SO(3) must be truncated at some 
finite £ m2LX in the same sense that equation (2) can be regarded as a Taylor series in x on truncated at 
order 2. A PDF in the form of equation (7), truncated at £ m3X = 2 , was first considered by Lo [22, 23], who 
referred to it as an exponential Fourier density. We will truncate the series even more drastically at 
£ max = 1 , but we will include both process noise and non-attitude bias parameters, neither of which was 
considered by Lo. The appendix contains some preliminary thoughts on extending the representation to 
£ = 2. We prefer not to deal with the complex matrices D e , so we write the truncated equation (7) as 

J^ Yv (A) = -tr(Bl w A) + ..., (8) 

where B^ v is a real 3x3 matrix and the ellipses denote terms independent of A that are determined in 
principle by the normalization of the PDF. We will see that it is not necessary to compute these terms. 

The unitary equivalence of equations (7) and (8) is established by the relations [29] D° = 1 and 


D 1 (A) = M/AMj , 

where M x is the unitary matrix defined by 


M,- 


-1/V2 0 

-i/y/2 0 
0 1 


l/V2 " 
-i/V 2 , 
0 


(9) 


( 10 ) 


with the dagger denoting the Hermitian conjugate (transpose complex conjugate) matrix. 

All the information about the PDF is contained in the real 3x3 matrix B ^ v , which has been known for 

some time to contain the correct number of free parameters to represent the mean and covariance of a state 
with three degrees of freedom, namely three parameters for the mean and six for the attitude covariance [30]. 
In fact, a filter based on equation (8) will look very much like Shuster’s Filter QUEST [31]. The 
representation of the PDF for spacecraft attitude given by equations (6) and (8) is completely free of 
singularities, discontinuities, or constraints. Shuster has pointed out another advantage of defining the PDF 
on the compact manifold of SO(3) rather than on some 3-dimensional Euclidean tangent space, namely that 
it allows us to represent a complete lack of attitude knowledge by a uniform probability distribution [32]. 
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It is possible to define three estimates in parallel with the Gaussian case. The conditional expectation 

Afiw 55 / ApA\Y v (A)dA ( 11 ) 

SO(3) * 


does not provide a valid attitude estimate, however, because it is not an orthogonal matrix in general, i.e. it 
does not belong to SO(3). This has led to the erroneous belief that PDFs on non-Euclidean manifolds 
necessarily yield unphysical estimates [11]. In contrast to this, Oshman and Carmi [19] have emphasized 
that perfectly valid attitude estimates are provided by the MAP estimate 


^ MAP _ argmax [" ( A _ arg max 

“ AGSO(3)L^V r v AeSO(3) 


[tr(<i V A)] 


( 12 ) 


and the MMSE estimate 


jMMSE _ arg min 


>lv 


A'gS0(3) 

_ arg min 
~~ A'eSO(3) 


J || A - A'\\ p p A | y W) 

SO(3) h 


S j(3) I' 4 ~ Afi,V PA * ^ )dA + II V " ^l' 2 


arg min II j _ ,11 

A'eSO(3) W ^ v II 


arg min 
*” A'gSO(3) 


Equation (13) employs the Frobenius norm, defined by 


S tr(MM ) = I Mfj . 

ij 


(13) 


(14) 


There are many ways to find the MAP and MMSE estimates [33], We will use the SVD method [34, 35], 
which is not the most computationally efficient, but has excellent numerical stability and provides the 
greatest analytical insight. We find the MAP by performing the SVD of the matrix B lv 


V = USV‘ = U + S'V ? , 


(15) 


where S is a diagonal matrix with non-negative elements in descending order of magnitude down the main 
diagonal, and where U and V are orthogonal, but not necessarily proper, matrices. The matrices in the right- 
hand member of equation (15) are defined by 

U + =U diagffl, 1, det £/]), V + = V diag([l, 1, det V]), and S' a S diag([l, 1, det U det V]) , (16) 

so that U + and V + are proper orthogonal matrices. Then the SVD method tells us that 


i MAP 
\\v 


= U + V l = U diag([l, 1, det U det V])V T . 


(17) 


The SVD method will return an attitude estimate for any B ^ v , but B must have rank greater than one 
for the estimate to be valid. The MMSE estimate is found similarly from the SVD of A ^ v : 


and 


A[i\ v = USV T = U + S'V+ 

(18) 

A MMSE _ fj yT 
A ^\v “ u + v + • 

(19) 


4 


We will now show that the MAP and MMSE estimates are identical in this case. Equations (8) and (15) 
give 


J A ,y v (A) = - tr <y + S'UZ A) + ... = - tr(S'Ul A V + ) + ... = - tr (S' A') + . . . 


with 


A' = UlAV + 


Then equation (11) gives 


V = \ Acxp[-J A , Y (A)]dA = U + \ J A' exp[ tr(S'A') + . . .] dA' 1 V + . 

SO(3) M [S0(3) J 


( 20 ) 


( 21 ) 


( 22 ) 


The invariance of the Haar measure allowed us to change the integration over dA to integration over dA' . 
We can use any convenient parameterization for A' to perform the integral, and we choose to use a 

quaternion [36, 37] with vector part q and scalar part-^1 - |q| 2 ; 


A'( q) = 


1 — 2 (ql + ql) 2^ 2 + ^ 3> /l-|q| 2 | 2^ 3 -^ 2> /l-|q| 2 j 

2^i« 2 -«3 V 1 - M 2 ) 1 - 2 (?3 + ?i 2 ) 2^ 2 <7 3 + ?1> /l-|q| 2 

2^i? 3 +42>/HqF) 2 |? 2 ?3 -< 7i^l - |q| 2 ) l-2(<7 2 +g|) 


(23) 


The square root singularity in this parameterization is of no concern, since the quaternion is only used as a 
dummy variable of integration, not as the filter’s attitude representation. The volume element in the 
quaternion parameterization, normalized so that SO(3) has unit volume, is [32] 

dA' = n~ 2 (1 - |q| 2 ) “ 1/2 d 3 q . (24) 

It is clear from inspection of equation (23) that all the off-diagonal elements of the matrix integral in 
equation (22) vanish by symmetry in q. Comparison of equations (18) and (22) then shows that 

U + =U + , V + =V + , and S' u = n 1 j A'. exp[ tr(S'A') + . . .] (1 - |q| 2 ) “ 1/2 d 3 q for 1 = 1,2, 3 (25) 

hP -i 


and thus that equations (17) and (19) give 

4 MMSE _ a MAP 


Note that the normalization of the PDF enters into S ' , but not into the attitude estimate. 


(26) 


ATTITUDE/PARAMETER PDF 

We now turn to the problem of simultaneously estimating the attitude and an A-dimensional vector of bias 
parameters. If these were independent, their joint PDF would just be the product of the PDFs of equations 
(1) and (6), so the negative-log likelihood function would be the sum of equations (2) and (8). This PDF 
would have \ N(N + 3) + 9 free parameters. The joint PDF won’t have this simple form when the attitude 

and bias parameters are correlated, of course. An EKF for this problem would have ^(N + 3 )(N + 6) or 

2- N(N + 3) + 3N + 9 parameters to describe the mean and covariance of the state with N + 3 independent 

components. The additional 3 N parameters contain information about the correlations between the three 
attitude parameters and the N bias parameters. 
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The terms in a Gaussian EKF describing the correlations between the attitude parameters and the bias 
parameters are linear in the components of the bias parameter vector and in the components of the attitude 
parameter vector. The obvious generalization for the orthogonal filter is to include terms in the joint PDF 
that are linear in both the bias vector and the attitude matrix, which leads to the ansatz 

J A„ I Yy x ) = 2 x T F vlv X - <t£lv x - trtflji v (*M1 + . . . , (27) 


where 


^/ilvOO “ Bfi\v t O 5- ^fi\v,k x k * 

*=1 


(28) 


As before, the ellipses denote terms that are independent of the state variables, i.e. of both A and x. Each 
V,* for k = 0 ,..., N is a 3x3 matrix independent of the state variables, so we have introduced 6 N more 

parameters than are used by an EKF. This can be viewed as either an advantage, in that the orthogonal filter 
carries more information about correlations, or a disadvantage, since it increases the computational effort. 
The number of extra parameters for 1 < N < 18 obeys the relationship 

2 < number of extra parameters in the orthogonal filter < 24 
7 number of parameters in an EKF ~~ 35 


It is not easy to establish that the MAP and MMSE estimates are equal for the correlated problem, and it 
may not even be true. Therefore, we will concentrate on the more easily computed MAP estimates, which 
we will denote by carets. These are found by simultaneously satisfying the equations 


^lv ~ 




(30) 


and 


~ __ arg min T _ 

xeR^ L 



(31) 


with J AiitX \ Yy (A,x mIv ) given by equations (27) and (28). The attitude estimate is given by equations (15) 
and (17), remembering to use . Defining a vector function T ]^, V (A) with components given by 

[lWG4)]* = tr(B£iv,,*A) for k = l,...,N. (32) 


allows us to rewrite equation (31) as 

x /iiv = {2 x3r ^)ii v x_ M^i v +1, Vi v (^/ii v )] T x- tr(Bj, v _ 0^1 v ) + - • •} . ( 33 ) 


so the MAP estimate of the parameter vector is 


x /ilv = (^,v) ‘[^Iv+^IvCV)] • (34) 

This problem can be solved by finding A u)v from the SVD of | V (x M , v ) using some initial guess for 
x M | V , then updating x^ !v using equations (32) and (34), and iterating this procedure until it converges. The 
literature may suggest better ways to find this simultaneous minimum, however [38^40|. 
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MEASUREMENT UPDATE 


We will consider the measurement update before we consider the time propagation between measurements, 
since it is somewhat simpler. We assume that we have propagated the state to the time using 
measurements up to the previous time Vi- The update of the PDF to incorporate the information in the 
measurement y M is given by Bayes’s theorem for probability densities [1, 2] 


Pa,,x,\Y,( A ’ X ) ~ 


Pyu I A 




I 1 Py u \A u ,x u 

SO(3) M N " » " 


(V^/Vx, iy, 

(y^l A,x)/> VVVi 


(A,x) 

( A,x)d N xdA ’ 


(35) 


where P y ^\A^,x^ (y M 1 ^>x)is the PDF for the measurement conditioned on the pre-measurement state. The 

denominator of equation (35) only contributes terms independent of A and x, which are not needed to 
compute the MAP, so the update in terms of the negative-log-likelihood function is 

j a ,, x , 05. (A ’ x) = Ja , V 1 (A ’ x) ■ ln[ Py, IV-*/. (y ^ IA ’ X)] + -"’ ( 36 > 


with the ellipses standing for omitted terms that are independent of A and x. 


The update can be performed exactly if the functional dependence on A and x of the right side of equation 
(36) is the same as the functional dependence of equation (27). If this is not the case, an approximation will 
be required. We will consider a specific example, but the method outlined can be extended in principle to 
quite general measurement models. Consider a measurement y M of the ambient magnetic field by a triaxial 
magnetometer with a constant bias x m in the spacecraft body frame and with Gaussian measurement errors 
with measurement information matrix (inverse covariance) F ™ . For this measurement model, 


- ■"[/>,„ \ a , ,X„ (y„ I A, x)] = ± (y, - AH, - \ m ) T F* (y, - AH^ - x m ) 

= iy^;y /i + \x T m F^ m -y T ,F;x m -{y,-x m ) T F^AH, + \H T ,A T F^AH,, 


(37) 


where is the magnetic field in the coordinate frame to which the attitude is referenced. All the terms on 
the right side of this equation except the last are of the desired form. The last term is actually independent of 
A and x if F" 1 is a multiple of the identity matrix. If F" 1 is not a multiple of the identity matrix this term 
has £ = 0 , £ = 1 , and £ = 2 parts [29]. Since the PDF does not contain an i = 2 part, we must discard this 
part of the last term on the right side of equation (37), giving the approximation 


2 AH, ~W,+ tr(*F^A) . 


(38) 


We evaluate the £ = 0 part i// M and the £ - 1 part ^ by means of the relationships [32] 


j dA — 1 , J AdA 0, J AijAj.fi dA ^8 lk 8j^ and J A^A k ^A mn dA 5 ^ikm^jin • (39) 

S0(3) SO(3) SO(3) S0(3) 


The first of these equations follows from the normalization of the volume of SO(3), and the second reflects 
the orthogonality of the £ = 0 and £ = 1 irreducible representations of SO(3). The third uses the Kronecker 
delta symbol S ik and the fourth uses the Levi-Civita antisymmetric symbol e ikm , defined by 

e i 23 = e 23 i = £312 = 1 * £132 = £321 = £213 = * e ijk = 0 if any two indices are equal. This gives 


and 


¥ 


,= j W,+U(%A)]dA=\ j (H T ,A T F;AH,)dA = ±\H,\ 2 trF; 

S0(3) SO(3) 

^ = 3 j A[^„+trCFjA)]dA = f j A (H T ,A T F™ AH,)dA = 0 . 

SO(3) SO(3) 


(40) 

(41) 


The vanishing of 4^ results from the antisymmetry of the Levi-Civita symbol and the symmetry of F ™ . 
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Substituting equation (37) into equation (36) and inserting equation (27) gives measurement update 

- y l F ™* m - (y„ - x m ) T FpAH^ +..., 


( 42 ) 


where the ellipses indicating neglected terms independent of A (i.e., having £ = 0 ) and x. The explicit 
update of the parameters in the PDF, assuming that the components of x m are the first three components in 
the bias vector x, follows from this: 


r?x _ ri , 

r H\H - r n\n-l ^ 


0(JV-3)x3 


^3x(A^-3) 

®(N-3)x(N-3) 


%\ fi = 4W-1 + 


®N-3 


(43) 

(44) 


Vo = Vi,» + Wr (45) 

= B^_ u - . for i = 1, 2, 3 (46a) 


1 ^ 3, (46b) 

where e f denotes the coordinate axis unit vectors: 

e^tlOOf, e 2 = [0 1 0] r , e 3 =[00lf. (47) 

In the attitude-only case, the only update is given by equation (45), which is the same as in Shuster’s Filter 
QUEST algorithm [31]. 


PROPAGATION BETWEEN MEASUREMENTS 


We now turn to the procedure for propagating the PDF from time to time t^ x , i.e. for propagating 
Pa , x i y (A,x) t0 Pa ,,x if (A, x) . We use the Fokker- Planck partial differential equation (PDE) for the 

PDF, or actually an equivalent PDE for the negative-log-likelihood function, to derive ordinary differential 
equations for the parameters , and in the PDF. The subscripts t\ fi are to convey the idea 

that these quantities will be integrated over t from to We will, however, suppress many subscripts 
and arguments throughout this section to simplify the notation. 


We begin by stacking the columns u, v, and w of the attitude matrix A in a nine-component vector x A 

u 

A = [uvw] and x A = 


(48) 


Then we define the (yV+9)-component state vector ^ as 



where x m is the vector of measurement biases and x d is a vector of dynamic biases. The Fokker-Planck 
equation is derived from the stochastic differential equation (SDE) [2, 20, 41] obeyed by the state vector 


(49) 


% 


d$ = fdt + Gdw , 


(50) 
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where dw is a Wiener process. Equation (50) should be considered as a Stratonovich SDE rather than as an 
Ito SDE for two reasons. First, solutions of a Stratonovich SDE, unlike those of an Ito SDE, obey the 
usual laws of calculus, including the product rule for differentiation. Thus the orthogonality constraint on 
A is preserved by a Stratonovich SDE, but would be violated by an Ito SDE. The second argument appeals 
to the Wong-Zakai theorem [2, 41], which says that, as a rule of thumb, the Stratonovich SDE represents 
the unphysical white-noise infinite-bandwidth limit of finite-bandwidth noise. It is reassuring that all these 
issues with SDEs can be forgotten after we have derived the Fokker-Planck equation. 


The Fokker-Planck equation arising from the Stratonovich SDE given by equation (50) is [20] 


dp _ *+» d(f iP ) ! »+> _9_ 
dt m 94 2 i.m= i 94 


G ik 


9 (Gjtp) 


(51) 


Substituting p = exp (-7) gives the PDE for the negative-log-likelihood function 


97 _ N + 9 

9 1 h 


9 /, 


97 


94 / '94 %94 94 


9 gj 97 

8i 


(52) 


where the (N+9)-component vector g is defined by 


N+9 

8i = I G ik 
j,k=\ 


dG jlr 


--Gii 


dJ 


Hj JK Hj 


(53) 


The left side of equation (52) is easy to evaluate; it is simply 

j x r F / f jU x - - tr(B^ 0 A) - X x k tr (B^ * A) + . . . , (54) 

where the dots denote time derivatives and the ellipses, as always, indicate terms independent of A and x 
that do not concern us. Evaluating the right side of equation (52) requires the specification of a dynamic 
model for ^ . We will choose the model of a spacecraft provided with three-axis rate information from gyros 

satisfying Farrenkopf’s noise model, with the gyro dynamics used in place of an Eulerian dynamics model 
of rigid body motion [6, 42]. The techniques we apply to this particular model can be generalized to other 
dynamic models, as was the case for measurement modeling. 

The attitude matrix obeys the SDE 

dA = -U»x]A (55) 


where the cross-product matrix is defined as 


0 -r* 


[rx]- 


-r 0 


0 -r, 

r \ o_ 


(56) 


and the infinitesimal rotation vector is given by 

70 = (w g - \ d ) dt - G v dw v . (57) 

The three-dimensional vectors <n g and \ d are the gyro rate output vector and the dynamic bias vector, a 
vector of gyro drift biases for this dynamic model. The Wiener process 7w v is the gyro drift rate noise 
process. The gyro drift rate biases are assumed to obey the SDE 


dx d =G u dy/ u , 


(58) 
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where dw u is the gyro drift rate ramp noise Wiener process . Using equation (48), equations (55) and (57) 
can be put in the vector form 

dx A = -T t a dQ = T t a [(x rf - (O g )dt + G v dw v ] , (59) 

where 

r A =[[ux] [vx] [wx]]. (60) 

Now we assume that the measurement bias parameters obey an equation analogous to equation (58) 

dx m =G m dw m , (61) 

to arrive at a state dynamics equation in the form of equation (50) with 


f = 


It will be convenient to stack the columns of the matrices B tl ^ j for i = 0,...,N in nine-component 
vectors b,, in the same way as equation (48) stacks the columns of A , and to define the 9 xN matrix 

(B = [b { b 2 b^]. 

This allows us to write the negative-log-likelihood function as 

J = jX T F^x - <t>^x - x^b 0 - x^x + . . . , 
with the partial derivatives with respect to the attitude vector 

T~- = - (fo 0 +®x) 
dx A 


(x d -ta g ) 


' r A G v 

^3x(A^-3) 

^3x3 


1 


®JV- 3 

, G = 

0(W-3)x3 

G m 

0(W-3)x3 

, and dw = 

dv/ m 

(62) 

0 3 


^3x3 

®3x(N-3) 

G u 


_dy/ u _ 



and with respect to the bias vector 

dJ 


-^=- Q,\„ +$ T *A- %*) = “ M>ll M + - 


(63) 


(64) 


(65) 


( 66 ) 


where 1\^ V (A) is defined by equation (32). We are now prepared to evaluate the right side of equation (52) 
one term at a time. 


9 9[r^(x rf — top],. 

h 3$ h 3$ 

because <*• does not appear in the ith row of V A . The next term is 

N+9 ?J 9 dJ t t 

-1 fi^r = -lfi^r=(x d -(o ) r 4 (b 0 + ®x) = (x rf -© ) 

i=l /=l °bi 

where 

^ r A (b 0 + ®x). 

Using the definitions of V A , b 0 , B , and B^ v (x) shows that £ has the components 

C, = tr{B,^(x)[e,. x]a| for f = l,2, 3. 


(67) 


( 68 ) 


(69) 


(70) 
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The fact that the ith row of G does not depend on <*• allows us to write the vector g as 


N+9 

Si = 2 G ik 
j,k = 1 


f *G jk 


~Gjk 


dJ 


Hj JK Kj 


N+9 T dJ 
= -l(GG T ) ij — . 
J = i ^ j 


Writing 


gg t = 


Qv T a ®3x(N- 3) ^3x3 

0(A'-3)x3 Qm ®(N- 3)x3 

a 


^ aQ\F A ®3xN 

. ®Nx3 Qb . 


J 3x3 u 3x(iV-3) 

with the customary process noise power spectral density matrices defined by 

Qv = G v G l » Qm = G m G m’ 30(1 Qu = G « G w > 


and using equations (65), (66), and (69) gives 

g = 


r \QX 


Qb\$t 


Then the last two terms on the right side of equation (52) are given by 

jTsi |r = -i{c r fivC+Gr IM +%,v(A) - %xfQb [+tl M +%l v(A)-/ff M X]} 
and 

, N+? 9g, , 9 9 (r ^ Q v Q t 1 g 9{&[<t>rl„ + iy lv (A) - F*„X]},. 

2 w3& 2 A d£ 2 /=i 


(71) 

(72) 

(73) 

(74) 

(75) 

(76) 


Because <*■ does not appear in the ith row of F A , the first term on the right side of equation (76) can be 
written as 


9 3 


•is S (r$ Q v \j @£j/H, ) = i tr(r^ g v r B ) = i tr(r b t\q v ) , 

*=i y=i 


(77) 


where differentiating equation (69) shows that T B \s constructed from the columns of B tl ^(x) in the same 
way that equation (60) constructs V A from the columns of A. Then matrix multiplication gives 


r B r^ = / 3x3 tr[ Aitf/x)] - AB^(x) . 


(78) 


The second term on the right side of equation (76) can also be simplified, so the right side is 

- i ' S 9 If - = i tr [B^ (x)( / 3X3 trg v - Gv Ml + 1 tr {Q h F t x w ) . (79) 

/=1 

Putting all these terms together and omitting terms that depend neither on A nor on x gives 
4 \ T F,^x - (fi^x - tr(B^ 0 A) - I x k \x{Bl^ k A) = (x d -CD g ) 7 ’C+itr[^ l (x)(/ 3x3 trG v -Q V )A] 

k = 1 

- i {C r QX + G,i m + iw M) - Qb G«im + + • • ■ • (8°) 

We must find approximations for three of the terms in the expansion of the right side of this equation 
because their functional dependence on A and x does not agree with any of the terms on the left side. 
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Equation (70) shows that the first term on the right side of equation (80) is quadratic in x and linear in A. 
We expand x about its expectation x„ M to write this term as 


(x rf -CD s ) r C = tr{AB^(x)[(x d -0> s )x]} = tr j (x)[(x d - ) x] J 


+ tr [abJ^ (x„ m )[(x d - x d ) x]} + tr | A X [ B^ j (x - x, , M ) y - ][(x d - \ d ) x] j- , 


(81) 


omitting the subscript t\ ju on \ d . We approximate A by A t , in the last term in this equation, 


introducing errors of order |x - x /(/i | ||A - A tiM || . This gives the approximation to the first term on the right 
side of equation (80) 

(\ d -M s ) r ; = tr{AB,^(x)[(x d -capx]} + trjAB^^^ -x d )x]} + (x d -x d ) r Z,| M (x-x, lM ), (82) 


where Z /)/z is the 3 xN matrix 


(^7l /X )(/ ^ (l C / ^t\fl,j) * 


(83) 


The term on the right side of equation (80) is quadratic in both x and A since £ is linear in both 

of these variables. We begin analyzing this term by writing £ as 

C = tr{[e, xlAfiJ,(x)} = tr{[e < x]ABj l (x fl/ ,)} + tr|[e i x]A £ [fl^/x-x,,^] J. (84) 

We approximate A by A, i/X in the last term in this equation and use equation (83) to give 

?/ ~ {[C/ x]AB r |^(x r |^)} + ^^(x — x r |^)] ( - , (85) 


with errors of order |x-x„ m ||A- A,^| . The symmetry of the matrix = U + S'U+ causes £ to 

vanish when x = x,^ and A = A /l/x , though, so the approximation 

W 2vC - 7 . I tr{[e ( x]AB^(x (l/1 )}(0 v )y tr{[e 7 . x]Afl,^(x (l/i )} 


'.7=1 


+ I tr{[e, x]ABi(x IlM )}[Q v Z (lM (x-x, lM )] +i(x-x, l/1 ) 7 ’zie v Z,i M (x-x II/1 ). 

1 = 1 * 


( 86 ) 


has errors on the same order of magnitude as the approximation in equation (82). The first term on the right 
side of equation (86) is quadratic in A, so we approximate it as in the measurement update by 


7 X _ tr {fi^x^He,- x]A J(Q v ),ytr|B^(x r | #i )[e^ x]a|® y v + trOFj' A) . 


(87) 


For the MAP estimator, we only need to compute the £ = 1 part V F V , which is given by equation (39) as 

3 


( v * / v)m«=f I (Qv)y ! A mn tr{fi^(x (lAJ )[e,x]A}tr{B^(x (lM )[e y x]A}c(A 

1.7=1 SO(3) 

= T X (Gv )(/ {[®i *1 , {[®/ 

i,j,k,C,p,q = 1 Kt pq 


( 88 ) 
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The term jT|£ lv (A)Q fc T|^| V (A) on the right side of equation (80) is quadratic in A, so we approximate it as 
^n\M)Qb^n\M) = l X H(Bl^ p A)(Q b ) pq lr{B^ q A) ~yr b + trC^A) , (89) 

P ^= 1 

In parallel with equation (88), we only need to compute the £ = 1 part, which is given by 

( X ^b)mn~~2 J* ^mn^\fi\v(^Qb ” ~4 ^ ( Qb^pq 2- ^t\n,p\j ^t\n,q^k£^mik^nj£ * (^) 

SO(3) PW=1 i,j,k.t=l 

Equating terms with the same dependence on A and x on the two sides of equation (80) with the above 
approximations for the first three terms on the right gives the final propagation equations for the parameters 
in the PDF, assuming that the components of \ d are the last three components in the bias vector x: 


Fn,= 




®(N-3)xN 
Zf\ n 

®(N-3)xN 

Zt\fi 


"T®yVx(Af-3) Z'\nQ v Z,^ 

x,\n + zl, (x d - Q v Z„ M X (IM ) - Q b <)), lM 


(91) 


(92) 


K®g 2 ( ^3x3 ^Qv Qv)Bf\n,0 + X 


i = 1 


" [(*<< “ Qv Z , lfl X„ M ) X]B,^ (X tifl ) + 


(93) 


B t \ fl.i - - [(®g - - | ( hx3 tr 2v — Qv )B t \^i - X (F^QhhjB^j 

7=1 

-KQvZtwMBtifiiXtin) 


for 1 < i < TV - 3 


(94a) 


Bt\fi,i [(®g *d)m W 2 ^ Fx3 ^Qv Qy)B, W X (F t \pQi,)ijB t ip'j 

7=1 


- K(2v z /l/i,/ “ e i-(A'-3)) X l %(*,!„) 


for i = (V-2,(V-l,(V, (94b) 


where z nfl i is the ith column of ^t\fl . The first two terms on the right side of equation (93) give the 
propagation of Filter QUEST with a fading memory [31], and equation (94) is similar. The interesting new 
terms in the orthogonal filter are the nonlinear coupling terms in these equations. 

AN ASIDE 


Aside from the approximations we made, one objection can be raised to the derivation of the Fokker-Planck 
equation. Computing partial derivatives with respect to \ A implicitly assumes that these parameters can be 
varied independently, but independent variations of these parameters takes A out of SO(3). To assuage these 
fears, the derivation was repeated with the quaternion ray representation of the attitude [11] 


^(q^4>=(N 2+ ^4) 1 {(?4-W 2 ) / 3x3 +2 qq T - 2 ?4[qx]}- (95) 

which guarantees that A is an orthogonal matrix for any independent variations of q and q 4 , and with the 
(/V+4)-component state vector 

q 




<74 • 


( 96 ) 


X 


This derivation also yielded equation (80), with considerably more effort. 
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COVARIANCE 


None of the computations in the orthogonal filter requires a covariance matrix. This is a disadvantage in 
practice, since the orthogonal filter provides no natural way to perform a data- independent covariance 
analysis, as is straightforward in an EKF. We can, however, compute the Cramer-Rao lower bound on the 
covariance from the Fisher information matrix [1]. 

The most useful form of the covariance matrix is in terms of the covariance of a vector 0 of infinitesimal 
rotation errors and the vector x = x - x M)v of bias errors. We write the attitude matrix as 

A = exp(- [0 x])A lrue = (/ 3x3 - 10 x] + i [0 x] 2 )^ lv , (97) 

expanding the exponential and approximating the true attitude by its estimate. Substituting this into 
equation (27), using equation (28), and expanding gives 

■ / VV 1 'v ( ' 4 ’ x) = 2 - <IV X + tr{B£ lv (x^ lv )[0 x]A M , v } 

N ^ N * „ (98) 

- I Xjtr(B lliv jA flW )+ I Xjtr{B^ v j \ex]A fl[v }-jtr{Bl w (x tlW )\exYA ll]v } + ..., 
j = 1 7=1 


where constant terms and terms of higher than second order have been omitted. Evaluating the second-order 
partial derivatives of this negative-log-likelihood function gives the Fisher information matrix 


[p 2 y/a000 d 2 j/dddx' 

L 

N* 

fa? 

1 

\ d 2 j/ 9x90 d 1 J / dx dx 

r 

yT pX 

^ fl\V r fi Iv 


(99) 


where Z^ v is the matrix defined by equation (83). The attitude information matrix F^ v is given by 

V " ^3x3 A t \^B t ^(x t ^)] ” 


(100) 


where the SVD has been used to obtain the second equality. The partitions of the covariance matrix 


P = F^ = 
1 fi\v ~ r fi\v — 


r fl\V 

KJ 


y»w 


r H\v 

P x 

fulv_ 


( 101 ) 


are computed using the standard formulas for the inversion of a partitioned matrix [35]; 

fyv = {^3x3 — Z M iv} (102) 

fyv = “ ^/zlv {^3x3 (*/!//)} J (103) 

^lv=-^yZ M | V (^v) 1 =_ {^3x3 AljU^n^(*n/x)} (104) 


The information matrix and covariance matrix only depend on the skew parts of the matrices A^ V B^ V y , so 
only on three independent linear combinations of the elements of B y . This shows that the covariance 
does not contain all the information in B^ v j , as we noted earlier. It is impossible to know which linear 
combination of the components of B^ v j appears in the covariance without knowing A ^ v , however. 
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With the dynamic model of equation (55) and (57), the (A+3)-component state vector [7] 




e 


X,,-X„ 


obeys the linearized SDE 
with 


d% = T%dt + §dw , 




®3x(N-3) 

~hx3 


G v 

®3x(N-3) 

^3x3 

T = 

0(W-3)x3 

®(N-3)x(N-3) 

®(N-3)x3 

and § = 

®(N-3)x3 

G m 

0(W-3)x3 


^3x3 

®3x(N-3) 

^3x3 


^3x3 

®3x(N-3) 



( 105 ) 


(106) 


(107) 


The covariance of this state vector £ obeys the propagation equation 

Pt\ t i = ' F ' p t\fi + %‘ FT + g§ t . 

so the Fisher information matrix, the inverse of the covariance, obeys 

F t \n — F t \^§ F t \p. 

Then the partitions of the information matrix propagate as 

= - [(co, - x d )*]F,% - Ff ¥ [(m g - x d )xf - Ff ¥ Q v F,% - Z lifl Q b Z . 

= — [(®®g — Xj)x]Z ( + ^0 3x ( A f_ 3 ) F t ^ J — F i¥ Q v Z i¥ — Z,^Q b F t ^ . 

ro, 


F,^ = 


J (N-3)xN 


+ 


[0yVx(N-3) zf\n ] ” KlyQbKly, Z I\fiQv Z i 


t\fl ■ 


(108) 

(109) 

( 110 ) 

(in) 

( 112 ) 


It is reassuring that equation (112) is identical to equation (91) of the orthogonal filter. It is also interesting 
that propagation moves the information down from F t e lfl to Z t]fl and then from Z tlfl to F t *^ . This is 

exactly opposite to the covariance propagation, in which the errors propagate up from to P t c lfl and then 
from P t c ^ to P^ . This counterintuitive behavior can be explained by the observation that knowledge of 
the attitude time dependence provides information about the biases. 

PROPAGATION OF THE MAP ESTIMATES 

Since the MAP estimation of the attitude is an orthogonal matrix, its derivative must be given by 

< 113 ) 

for some angular velocity vector ©,|^that is to be determined. Substituting equation (17) gives 

[W,I M X]=— =-(U + V? +U + vl)V + Ul =(/ + ([WyX] + [© v x])t/+ , (114) 

where 

[© t/ x] = -C/[t/ + and [® v x] = -Vjv + . (115) 

This is equivalent to the vector equation 

W,l » = V + ((Ou+0>v) ■ ( 116 > 
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Differentiating equation (15) and using equation (115) gives 

B, (i tl „ ) = £/+(- [»„ x]S' + S'-S'[(D v x])V? . (117) 

If we now define the vector P by 

IPX] = B^x^-B^ix^A ^ , (118) 

we find after a little algebra that 

[^+Px] = S'r(eo c/ + co K )x] + [((D t / +a) v )x]5' = S , [(/+©,|^x] + [(y+ro ;l#J x]S' . (119) 

But this means that 

UlV = (I 3x3 trS'-S')Ula, ¥ (120) 


or 


In order to compute P from equation (118), we need to evaluate 


N N . 

X k x t\u,k X Bt\u,k x t\u,k * 

A:=l *=1 


The first two terms are given by equations (93) and (94), and differentiating equation (34) gives 

*t\p +Qb*\i\v . 


( 121 ) 


( 122 ) 


(123) 


after substitutions from equations (91) and (92). Putting all this together gives, after several cancellations, 

jV 

X 

1=1 


= {-K®, -^)x]-A(/ 3X 3tra-fiv)}^(i,i M )+ 2 K^r 1 ^], V/ + 4/ v • (124) 


We now make two simplifying assumptions. We first assume that the gyro drift rate process noise is 
isotropic, i.e. that Q v = cr* / 3x3 , so that 


2 ( ^3x3 tr Qv ~ Qv ) “ °v ^3x3 “ 2v 


and 


4 / v = 4<T v 2 adjB,^(x /l/i ). 

where adj(.) denotes the classical adjoint [35]. We also assume that Q b is diagonal, so that 

1 = 1 

With these simplifications, substituting equation (124) into equation (118) gives 
[Px] = 4| /i B,^ i (x,| /i )[(C0 g -x d )x] + [( 0 ) s -x d )x]B tltl (x,^)Aji tl 


(125) 


(126) 


+ X K^/f/j) l i\t\n]i(Af\(iBt\fi,i ~ 2 (Qbh I adj ( B, i , A [ lfX ) — adj (A, !#1 B^j ) ] . 

1 = 1 


N 

X 

1=1 


(127) 


The ^(/ 3x3 tr<2 v - Q v )B t]fl (x t ^) and *F V terms in equation (124) give no contribution owing to the 
symmetry of . Equation (127) is equivalent to the vector equation 


P = {^3x3 “ Al//^fl/j( x /l/z)K®g — x d) — ^tl^i ~2 X (Qb^ii ^t\n,i^t\n^t\n,i * (128) 

i=l 
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We find the time derivative of r\ tlfl from its definition in equation (32). With our simplifying assumptions, 
(T|ri M )i = tr( Bj w A, lfl + Bj w A,\n ) = tr{[(*»,, p - <O g + x d )x]B^,i IlM }- + F x Q b ri fl/1 ), , (129) 


using the fact that the trace of the product of a skew-symmetric matrix and a symmetric matrix vanishes. 
This gives 



“HrlM = -Z^( - + *,)- ~ F X Q b . 

030) 

Substituting into equation (121) then yields 


with 

<a„ /J =(D g -x d + A©,| /i , 

(131) 


= P^{Z,\^(F x T l + &]%„-£! (Q b ) u B lW A^z tW } , 

1=1 

(132) 

where P® [v 

is given by equation (102). Substituting these results back into equation (123) gives 



F,*n*t\ n = . 

(133) 


We see that the orthogonal filter includes effects that are ignored by the EKF, which has ©J^ = ©^ - x^ 
and x*f =0^ with the dynamic model of equations (50) and (62). It is worth noting that the extra term in 
equation (131) bears no resemblance to the extra term found in a second-order Kalman filter [7]. 

ALGORITHM SUMMARY 

The equations of the orthogonal attitude filter can be simply summarized. The parameters F t *^ , , and 

Bt\nj specify the PDF. A priori values of ^0 and x 0 | 0 are needed to initialize the filter. Then 

^ 010 =^ 010^010 (134) 

and 

Bo\o ti for / = 1,..., N. (135) 

If there is no a priori attitude estimate, # 0 io,o is also initialized to zero. If there is an initial attitude 
estimate A^o with inverse covariance F 0 ^ 0 , B olo o is initialized to [30] 

^oio,o = [2 ( tr ^010)^3x3 “ ^010 ] A)io • (^ 36 ) 

MAP attitude and bias estimates are found by iteratively solving the following equations: 


USV T - B W Q + X k x k , 

k= 1 

(137) 

\ W =U diag([ 1 , 1 , det U det V])V T , 

(138) 

(^/xiv ” l^"(^/iiv, k^fi\v^ for k — 1, . . . , N , 

(139) 

*/ZlV “ (^ilv) [^ulv *n^lv(^lv)] * 

(140) 
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For the case of attitude determination using a triaxial magnetometer, x m is a vector of magnetometer biases 
(assumed to be in the first three components of x), and measurement information is incorporated by 


F p\n ~ F jZ\n-\ + 


F P 

0(JV-3 )x 3 


^3x(N-3) 
®(N- 3)x(A r -3) 


(141) 


%\n - 



F H\HJ = ~~ F H ' 

F H\ fij — ^/il/z-1,/ 



(142) 


(143) 

for i = 1,2,3 

(144a) 

for i > 3, 

(144b) 


where is the measurement of the ambient magnetic field, F™ is the inverse covariance of the assumed 

Gaussian measurement errors, is the magnetic field in the coordinate frame to which the attitude is 
referenced, and the coordinate axis unit vectors e, are defined by equation (47). 


For attitude propagation using the gyro-sensed attitude rate vector co^, the body frame, x d is a vector of gyro 

drift biases in the body frame (assumed to be in the last three components of x), and the parameters in the 
PDF are propagated between measurements by 




ro, 


\N-3)xN 

^ t\H 


' [°;Vx(N-3) Z?\ii ] “ F t\fi Qb F t*n ~ ^tlfiQv 


t\n 


(145) 


<t>fl M = 


®(N- 


\N-3)xN 


X,| M + (x d -Q v Z,^x, tlu )- Q b (t>„ M 


%0 - 2 (^3x3 tr 2v - Q v )B i¥ ,o + Z 

1 = 1 

-[(X d -e v Z t | M x,| M )x]B,| /1 (x ( | /i )+4 / v + ¥* 

Btlfi'i = ~ [(C®j — — -j( ^3 x 3 tr2v — — 21 (Ft\pQb\jBt\iij 

j = 1 

for 1 S I < yv - 3 


(146) 


(147) 


(148a) 


4l„,i =-[(«* -X d )X]B f|/Ji/ -4(/ 3x 3tre v -ev)V,i - I (^l M Qfc)yAl M ,y 

y=i 

" [(Gv z /i m ,i - e/_(Ar_ 3 ))xl for i = N-2,N-\,N , (148b) 

where Q v and <2z> are process noise power spectral density matrices defined by equations (72) and (73), 

~ ( z rl/i t y)/ = l r ([®/ > (149) 

and where X P V and 'Fj, are given by equations (88) and (90) in the general case, or by equations (125) and 
(126) with some simplifying assumptions about the process noise. 
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NUMERICAL EXAMPLE 


We test this algorithm for a spacecraft in a circular orbit about the Earth with inclination 35 deg and radius 
6775.19 km. All orbit perturbations are ignored, giving purely Keplerian motion. The spacecraft is assumed 
to maintain its body x (roll) axis along the velocity vector, its body z (yaw) axis along the nadir vector, and 
its body y (pitch) axis along the negative orbit normal. Then the orbit period is 92.5 minutes and the 
spacecraft angular velocity in the body frame is exactly 

CD = —(2^/5550) e 2 rad/sec. (150) 

These parameters approximate those of NASA’s Tropical Rainfall Measuring Mission (TRMM) spacecraft 
[43]. The Earth’s magnetic field is modeled by a tilted dipole [44], which for our circular orbit gives 

H(r) = // 0 [3(m • r unit )r unit - m] , (151) 


where H 0 is 25.54 |iT, r unit is a unit vector pointing from the center of the Earth to the spacecraft, and 


m = 


sin 6 m sin a m 
sin 0 m cos a m 
. cos e m 


(152) 


gives the direction of the Earth’s dipole, with 6 m = 168.6 deg and a m =a 0 + tx 4. 178 x 10 3 deg/sec. These 
orbit, attitude, and magnetic field models are very crude, but they suffice for our purposes. 


Magnetometer measurements are simulated with a bias of + 0.5 (iT and zero-mean white noise with a 
standard deviation of o m = 0.5 |iT on each axis. Actual magnetic field errors have systematic components 

[45], but these are not relevant to the present filter comparisons. The filter uses F™ = o~ 2 I 3x3 and 
Qm = ^ 3 x 3 * since the magnetometer biases are assumed to be constant. The first measurement was 

simulated when the spacecraft crossed the ascending node, when the angle between the spacecraft x axis and 
the equatorial plane is 35 deg, the orbit inclination. 

The spacecraft is assumed to have three-axis rate-integrating gyros that output an incremental angle vector 
. The RIGs are modeled with initial biases of + 0.1 deg/hr , drift rate noise of <r v = 0.3 prad/sec 12 , 

A 2/2 

and drift rate ramp noise of cr M = 3 x 10 prad/sec on each axis, which are like the TRMM gyros. The 
filter uses Q v = <J 2 I 3x3 and Q u = o 2 l 3x3 . The true gyro drift rates are simulated at discrete times by 

x d (t) = x d (t tl ) + o u (t-t fl j l/2 n u , (153) 


where the components of n u are independent zero-mean normally distributed random variables with unity 
standard deviation. The true gyro incremental angle outputs are simulated by 

1 /7 

0" G ={to + \[x d (t) + x d (t^{t-t^) + \ol(t-t^) + ±ol(t-t^ n v , (154) 

where (D is given by equation (150) and the components of n v are also independent zero-mean normally 

distributed random variables with unity standard deviation. Equations (153) and (154) give biases and 
incremental angle outputs with variances and correlations that are consistent with equations (57) and (58). 
The gyros are assumed to be read out at the measurement processing rate, so we assume the constant rate 

®«=(V 0 55 ) 

to propagate the PDF between the measurements at times t fI and t^ +l . 
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The simulation was performed using MATLAB® and its differential equation integrator ODE23. The initial 
attitude estimate at the time of the first measurement had the z axis along the nadir vector and the x axis in 
the equatorial plane, which means that the initial attitude estimate has a 35 deg yaw error. The initial 
estimates of the gyro drift bias and magnetometer bias were set to zero. The PDF was initialized with 
information equivalent to standard deviations on each axis of 40.5 deg in attitude, 0. 1 deg/hour in the IRU 
drift biases, and 0.5 [iT in the TAM biases. Iterations of equations (137)-(140) were terminated when 

Ax t F£\ v Ax< 10 -8 (156) 

where Ax is the change in the estimate of the bias state in successive iterations. Using tighter tolerances 
on the iteration convergence or different numerical integrators had no significant effect on the results. 

Figure 1 shows the estimation results using the full orthogonal filter described by equations (134)-(149). 
Figure 2 shows the estimation results using the same filter except that the last term is deleted from the 
right side of equation (147). The results shown in Figures 1 and 2 are very close for the first 150 minutes of 
the simulation; the attitude estimates agree to within 0.01 deg, the IRU drift bias estimates to within 0.01 
deg/hour, and the TAM bias estimates to within 1 nT. The pitch attitude estimates shown in Figure 1 are 
more accurate than those in Figure 2 at later times, but the estimator with x ¥ b included becomes unstable in 
roll and yaw. This behavior probably results from the approximations required to derive equations (91)-(94) 
from equation (80). It is possible that retaining the i = 2 contribution neglected by equation (89) would 
stabilize the orthogonal filter, at the expense of increased complexity. Further investigation shows that only 
the skew-symmetric part of the product causes the divergence. The approximation in equation (87) 

turns out to be harmless because A t \J¥ T v is symmetric, as was pointed out below equation (127). 

The same simulated data were used in an EKF employing the same measurement and process noise 
assumptions and the same initial conditions. The EKF, like the orthogonal filter, converges to useful 
estimates in about one orbit (92.5 minutes), a typical convergence time for magnetometer-only attitude 
estimation [45]. Because the EKF is a linearized algorithm, its estimates differ from those of the orthogonal 
filter at early times when the estimation errors are large. The EKF estimates agree quite well with those 
shown in Figure 2 at later times, however; the differences after 150 minutes being less than 0.003 deg in 
roll and yaw, 0.05 in pitch, 0.01 deg/hour in IRU drift bias, 1.5 nT in roll and yaw TAM bias, and 8 nT in 
pitch TAM bias. The smaller pitch estimation errors in Figure 1 compared to Figure 2 allow us to hope 
that the orthogonal filter would be more accurate than the EKF if its instability could be corrected. 

CONCLUSION 

The filter presented in this paper solves for the attitude matrix directly, along with an ^-dimensional vector 
of bias states, by means of a new parameterization of the probability density function (PDF) of the 
combined attitude/bias state on the Cartesian product of SO(3) and . The negative-log-likelihood 
function, the negative of the logarithm of the PDF, is expressed as a quadratic function of the biases and, in 
parallel with the QUEST algorithm, a linear function of the attitude matrix. Maximum a posteriori 
probability estimates of the attitude matrix and the bias parameters are found by minimizing the negative- 
log-likelihood function. This filter can thus be seen as a generalization of the filter QUEST algorithm. 

The filter uses Bayes’s formula to incorporate measurement information, and it can accommodate more 
general measurement models than filter QUEST, in principle. The attitude matrix, which is equivalent to 
the lowest order £ = 1 nontrivial irreducible representation of SO(3), is adequate for the simple measurement 
model considered in this paper, but more realistic measurement models may require the £ = 2 irreducible 
representation to be included in the PDF, at the expense of increased computational complexity. 

This filter uses the Fokker-Planck equation to propagate the PDF between measurements, so it incorporates 
process noise with more fidelity than does filter QUEST. The nonlinear Fokker-Planck equation introduces 
process-noise-dependent corrections to the implicit propagation of the estimates of the attitude and biases 
that are not considered by the extended Kalman filter. 
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TAM bias errors (microteslas) IRU bias errors (degrees/hour) attitude errors (degrees) 








Figure 1. Orthogonal filter estimation 
errors and 3 a error bounds. 
blue=roll, green=pitch, red=yaw 


Figure 2. Orthogonal filter estimation 
errors and 3 a error bounds with ¥*=0. 
blue=roll, green=pitch, red=yaw 
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Because the Fokker-Planck equation is quadratic in the negative-log-likelihood function, it also introduces 
the higher-order £-2 irreducible representation into the propagation. This order is not included in the 
PDF, so it must be ignored. Unfortunately, the resulting filter diverges without an ad hoc modification, and 
the modified filter performs no better than the extended Kalman filter. The neglect of the £ - 2 terms is the 
most probable cause of the problem, so it is possible that adding £ = 2 terms to the PDF would cure the 
instability. If £ = 2 terms are included in the PDF, however, the nonlinear Fokker-Planck equation will 
introduce £ = 3 and £ = 4 terms that would have to be ignored. It appears that no PDF including a finite 
number of irreducible representations of SO(3) can provide an exact solution of the Fokker-Planck equation, 
and it remains to be seen if a consistent convergent algorithm can be found. 
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APPENDIX: THE ROTATION MATRIX D\A) 


We would like to find a relationship similar to equation (9) for D 2 {A ) . Consider the real 6x6 matrix 


R(A) = 


r ul 
L r ll 
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LR _ 


(Ala) 
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(Aide) 


where A is a member of SO(3). It is easy to see that 

R(A T )=R T (A) (A2) 

and it can be shown by direct computation that R(A) is a (reducible) representation of SO(3): 

R(AA') = R(A)R(A'). (A3) 
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These two properties taken together show that R(A) is an orthogonal matrix, and it is not difficult to show 
that 


/?(exp[-0x]) = exp 


^3x3 

3/6 diag(0)[u 3 x] 


V6[u 3 x]diag(0) 

tex] 


(A4) 


where u 3 is the unit vector 

u 3 =(l/V3)[l, 1, if. (A5) 


Substituting equation (9) into 

Dl'm(A)= 1 C^ m , C l m % m (A) Dl im2 (A) (A6) 

m[ m 2 
m\ m 2 


where denotes the Clebsch-Gordan (or Wigner) coefficient [27, 29], and exercising the symbolic 

manipulation capabilities of Mathematica® gives the desired relation between D 2 (A) and R(A): 

D 2 (A)=m\R(A)M 2 , 

where 

-1/2 0 l/V6 0 -1/2 

1/2 0 i / a /6 0 1/2 

0 0 - 2/^6 0 0 

0 i/n /2 0 i/3/2 0 

0 1/3/2 0 -1/3/2 0 

-i/3/2 0 0 0 i/ 3/2 _ 

The matrix M 2 obeys the relations 

M]M 2 = / 5x5 and Af 2 A^ = / 6x6 - u 6 u£ , 

where 


Af 2 = 


Ua = 


L 0 3 j 


(A7) 


(A8) 


(A9ab) 


(A 10) 


is the basis for the one-dimensional null space of M 2 . The orthogonality of A can be used to show that u 6 
is an eigenvector of R(A) with unity eigenvalue: 

R(A) u 6 =u 6 , (All) 


which leads to the relationship 

R(A) = M 2 D 2 {A)M\ + u 6 u£ , (A 12) 

showing that R(A) contains t = 0 and £ = 2 components, but no t = 1 component. 
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An £ = 2 contribution to the real negative-log-likelihood function can be written in terms of a 5x5 array 
x F £= 2 of complex coefficients as 

=$tr[V\ =2 D\A)] = \tr[V\ =2 MlR(A)M 2 ] = ±ti[0 T R(A)], (A 13) 

where 

Q=M 2 '¥ e=2 M i 2 (A 14) 

is a 6x6 array of real coefficients. Because N^a 6 = 0 5 , this array obeys the constraints 

©u 6 =0 r u 6 =O 6 , (A 15) 

which mean that it has the 25 independent parameters appropriate to an £ = 2 representation of SO(3). 
Violation of the constraints causes no problem; it merely mixes a superfluous £ = 0 term into the 
negative-log-likelihood function. 

Now we use these results to represent the term ignored by the approximation of equation (38), which is 

IrF™ = ±tr (tf. A T F m A ) , (A16) 

where FC and F m are the symmetric traceless matrices 

-i|H p | 2 / 3x3 and F m = F™ -±(tr F M m )/ 3x3 . (A17) 

Using the symmetry of Q-C and F m , equation (A 16) can be represented in the form of equation (A 13) with 
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(A18de) 


Note that 0 obeys equation (A 15), because FC and F m are traceless. The neglected terms in equations (87) 
and (89) can be similarly represented as £ = 2 terms. 
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